Finding Cars, Goddesses and Enzymes: Parametrizable Acquisition of Labeled Instances for Open-Domain Information Extraction

Authors

  • Benjamin Van Durme
  • Marius Pasca
Abstract

A method is given for the extraction of large numbers of semantic classes along with their corresponding instances. Based on the recombination of elements clustered through distributional similarity, experimental results show the procedure allows for a parametric trade-off between high precision and expanded recall.

Similar resources

Outclassing Wikipedia in Open-Domain Information Extraction: Weakly-Supervised Acquisition of Attributes over Conceptual Hierarchies

A set of labeled classes of instances is extracted from text and linked into an existing conceptual hierarchy. Besides a significant increase in the coverage of the class labels assigned to individual instances, the resulting resource of labeled classes is more effective than similar data derived from the manually-created Wikipedia, in the task of attribute extraction over conceptual hierarchies.

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...

Turning Web Text and Search Queries into Factual Knowledge: Hierarchical Class Attribute Extraction

A seed-based framework for textual information extraction allows for weakly supervised acquisition of open-domain class attributes over conceptual hierarchies, from a combination of Web documents and query logs. Automatically-extracted labeled classes, consisting of a label (e.g., painkillers) and an associated set of instances (e.g., vicodin, oxycontin), are linked under existing conceptual hie...

Weakly-Supervised Acquisition of Open-Domain Classes and Class Attributes from Web Documents and Query Logs

A new approach to large-scale information extraction exploits both Web documents and query logs to acquire thousands of open-domain classes of instances, along with relevant sets of open-domain class attributes at precision levels previously obtained only on small-scale, manually-assembled classes.

Publication date: 2008